Large Experiment and Evaluation Tool for WEKA Classifiers

Authors

  • Dustin Baumgartner
  • Gürsel Serpen
Abstract

This paper presents a new Windows®-based software utility for WEKA, a data mining software workbench, that simplifies large-scale experimentation and evaluation with many algorithms and datasets in the classification context. The proposed tool, LEET (Large Experiment and Evaluation Tool), makes it possible to accomplish a variety of tasks that are presently difficult or impractical through the standard WEKA interfaces. These include comparing classifiers across multiple experiments, tracking execution time, calculating diversity measures, and summarizing the characteristics of many datasets. We have tested and validated LEET as part of a study involving 50+ machine learning classification/ensemble algorithms, 46 datasets, and the calculation of a variety of performance measures. With WEKA providing the algorithm implementations, LEET facilitates the execution and evaluation of large-scale experiments with greater ease than any existing interface.
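
The abstract does not say which diversity measures LEET calculates. As an illustration only, the following minimal Python sketch computes one common pairwise measure, the disagreement measure: the fraction of instances on which exactly one of two classifiers is correct. The function names and the dictionary layout are assumptions made for this example and are not part of LEET or WEKA.

```python
from itertools import combinations


def disagreement(pred_a, pred_b, truth):
    """Fraction of instances on which exactly one classifier is correct.

    Hypothetical helper for illustration; not a LEET or WEKA function.
    """
    if not truth or not (len(pred_a) == len(pred_b) == len(truth)):
        raise ValueError("prediction and label lists must be non-empty and equal in length")
    # (a == t) != (b == t) is True when the two classifiers differ in correctness.
    differing = sum((a == t) != (b == t) for a, b, t in zip(pred_a, pred_b, truth))
    return differing / len(truth)


def mean_pairwise_disagreement(predictions, truth):
    """Average the disagreement measure over all pairs of classifiers.

    `predictions` maps a classifier name to its list of predicted labels
    (an assumed layout chosen for this sketch).
    """
    pairs = list(combinations(predictions.values(), 2))
    if not pairs:
        raise ValueError("need at least two classifiers")
    return sum(disagreement(a, b, truth) for a, b in pairs) / len(pairs)
```

A higher value indicates that the classifiers make their errors on different instances, which is the property ensemble methods typically try to exploit.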

Similar resources

MEKA: A Multi-label/Multi-target Extension to WEKA

Multi-label classification has rapidly attracted interest in the machine learning literature, and there are now a large number and considerable variety of methods for this type of learning. We present Meka: an open-source Java framework based on the well-known Weka library. Meka provides interfaces to facilitate practical application, and a wealth of multi-label classifiers, evaluation metrics,...

Microsoft Word - Finding More Non-supersingular Elliptic Curves for Pairing..

Ensemble learning algorithms such as AdaBoost and Bagging have been in active research and shown improvements in classification results for several benchmarking data sets with mainly decision trees as their base classifiers. In this paper we experiment to apply these Meta learning techniques with classifiers such as random forests, neural networks and support vector machines. The data sets are ...

ADABOOST ENSEMBLE ALGORITHMS FOR BREAST CANCER CLASSIFICATION

With advances in technology, a range of tumor features has been collected for breast cancer (BC) diagnosis. Processing such large data sets poses challenges, including high storage requirements and the time needed for access and processing. The objective of this paper is to classify BC based on the extracted tumor features. To extract useful information and diagnose the tumo...

Discretizing Continuous Features for Naive Bayes and C4.5 Classifiers

In this work, popular discretization techniques for continuous features in data sets are surveyed, and a new one based on equal width binning and error minimization is introduced. This discretization technique is implemented for the UCI Machine Learning Repository [7] dataset, Adult database and tested on two classifiers from WEKA tool [6], NaiveBayes and J48. Relative performance changes for t...

Performance Comparison of Naïve Bayes and J48 Classification Algorithms

Classification is an important data mining technique with broad applications across many fields. It assigns each item in a set of data to one of a predefined set of classes or groups. This paper evaluates the performance of the Naïve Bayes and J48 classification algorithms. Naive ...

Publication date: 2009